Convolutional neural networks for approximating electrical and thermal conductivities of Cu-CNT composites

This article explores a deep learning approach for approximating the effective electrical and thermal conductivities of copper (Cu)-carbon nanotube (CNT) composites with CNTs aligned to the field direction. Convolutional neural networks (CNNs) are trained to map two-dimensional images of stochastic Cu-CNT networks to the corresponding conductivities. The CNN model learns to estimate the Cu-CNT composite conductivities for various CNT volume fractions, interfacial electrical resistances, Rc = 20 Ω–20 kΩ, and interfacial thermal resistances, R″t,c = 10⁻¹⁰–10⁻⁷ m²K/W. For training the CNNs, hyperparameters such as the learning rate, minibatch size, and number of hidden layer neurons are optimized. Without iteratively solving the physical governing equations, the trained CNN model approximates the electrical and thermal conductivities within a second, with a coefficient of determination (R²) greater than 0.98, whereas a conventional numerical simulation may take longer than 100 min. This work demonstrates the potential of deep learning surrogate models for complex transport processes in composite materials.

ANN was trained with 357 examples from the literature for various alloying elements along with their strengthening efficiencies. The strengthening efficiencies approximated by the ML model were comparable to those of experiments, with an accuracy greater than 90%. Another study used an ANN to predict the multiaxial strain-sensing response of CNT-polymer composites [11]. The ML model employed physics-based FEM at the microscale to generate 15,000 examples to train the ANN and approximated the macroscale strain responses in CNT-polymer composites with an accuracy of 99.65%. A further study developed and trained a Gaussian Process Regression (GPR) model to predict the tensile strength of CNT-polymer nanocomposites [12]. The training data were collected from the available literature and covered 23 different polymers, combined with 22 CNT incorporation methods and 20 CNT modifications. The GPR model performed strongly in predicting the tensile strength of CNT-polymer composites, with training and validation accuracies greater than 91%.
In this article, a convolutional neural network (CNN) is presented that infers the electrical and thermal conductivities of Cu-CNT composites at room temperature (27 °C) when input data describing the stochastic distribution of CNTs, the CNT volume fraction and the Cu-CNT interfacial resistance are provided. The CNN model learns the important features from images of Cu-CNT networks to predict the conductivities. To improve the accuracy of the CNN model, the influence of hyperparameters such as the learning rate, batch size and number of neurons in the hidden layers is investigated. The trained CNN can serve as a surrogate model for Cu-CNT composite systems if the morphology of the CNT network can be expressed in a two-dimensional (2D) image format. For example, if 2D images of Cu-CNT composites that sharply visualize the boundaries of the CNTs are available, obtained either from computational modeling or from processed microscopic images, the trained CNN can rapidly estimate the composite properties before expensive FEM simulations or actual measurements are conducted.

Training data generation
Training data are generated by creating 2D stochastic Cu-CNT networks and simulating their electrical and thermal conductivities. A 2D finite element model (FEM) is used for the simulation; it accounts for the CNT volume fraction, f, the Cu-CNT interfacial resistances, and the CNT-CNT interfacial resistances arising from the van der Waals interaction between two closely spaced CNTs. Since full details of the FEM are available elsewhere [13], only a minimal description follows. The 2D FEM employs a simplified CNT morphology, i.e., straight CNTs aligned to the field direction, enabling the simulation of CNT networks with high volume fractions (up to 80%) at reduced computational cost. Several studies have reported that aligned, straightened CNTs exhibit higher electrical and thermal conductivities than entangled, randomly oriented CNTs [14–18]. Figure 1a illustrates some examples of Cu-CNT network models with various f. The 2D composite consists of non-overlapping CNTs (length 500 nm and width 10 nm) that are randomly distributed in the Cu matrix. Figure 1b shows the electrical and thermal boundary conditions used in the FEM, which represent the following configurations: (1) steady-state electrical conduction and (2) heat conduction without internal heat generation. For electrical analysis, a potential difference, ΔV, of 1 μV is applied across the domain of length L. For thermal analysis, the
At Cu-CNT interfaces, the interfacial electrical resistance (Rc) and interfacial thermal resistance (R″t,c) are defined in the ranges of Rc = 20 Ω–20 kΩ and R″t,c = 10⁻¹⁰–10⁻⁷ m²K/W. The FEM estimates the electrical potential and temperature distributions in the Cu-CNT composite that are needed for the computation of the effective electrical conductivity (σe) and thermal conductivity (ke). The conductivities are normalized by the electrical conductivity (σCu = 0.58 × 10⁸ S/m) [13] and thermal conductivity (kCu = 401 W/mK) [13] of the Cu matrix at room temperature.
The training dataset was collected using FEM simulations and data augmentation. Figure 2 summarizes the data preparation process. First, 20 different images of Cu-CNT networks with random CNT distributions were generated for each target CNT fraction. Since 6 CNT volume fractions (i.e., f = 5%, 10%, 20%, 50%, 70% and 80%) were considered, 120 Cu-CNT network images were created in total. The three-channel RGB images of the Cu-CNT networks were converted into single-channel grayscale images to reduce the size of the data. The Cu-CNT interfacial resistance was encoded in the Cu-CNT network image through the grayscale intensity of the Cu domain, which represents Rc or R″t,c, while the CNT regions were represented in white (i.e., a pixel intensity of 255). The pixel intensity of the Cu domain was set to 0, 63, 129 or 163 to encode four different levels of Rc and R″t,c. This intensity encoding increased the total number of images to 480. The amount of training data was then amplified using a simple image transformation technique, similar to a previous work [4]. As shown in Fig. 2, the original images were flipped in three ways: (1) horizontal, (2) vertical and (3) diagonal flips. The transformed Cu-CNT networks were assumed to possess conductivities identical to those of their original Cu-CNT network. With this data augmentation, the total number of Cu-CNT network models increased to 1920. Finally, the Cu-CNT network images and the tabulated electrical and thermal conductivities from the FEM simulations were paired to form the training dataset.
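The data preparation steps described above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' pipeline: images are represented as nested lists, the helper names (`encode_resistance`, `augment`, `dataset_size`) are hypothetical, and the "diagonal flip" is assumed to be a horizontal flip followed by a vertical flip (a transpose would turn the CNTs perpendicular to the field and could not leave the conductivity unchanged).

```python
# Illustrative sketch only; images are nested lists of grayscale pixel values.
CU_LEVELS = (0, 63, 129, 163)  # Cu-domain intensities quoted in the text
CNT_INTENSITY = 255            # CNT pixels are white


def encode_resistance(mask, cu_intensity):
    """Turn a binary CNT mask (1 = CNT, 0 = Cu) into a grayscale image."""
    return [[CNT_INTENSITY if px else cu_intensity for px in row] for row in mask]


def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def flip_diagonal(img):
    """Assumption: horizontal flip followed by vertical flip (180 degree rotation)."""
    return flip_vertical(flip_horizontal(img))


def augment(img, conductivity):
    """Return the original and its three flipped copies, all sharing one label."""
    variants = [img, flip_horizontal(img), flip_vertical(img), flip_diagonal(img)]
    return [(variant, conductivity) for variant in variants]


def dataset_size(n_fractions=6, n_images=20, n_levels=len(CU_LEVELS), n_variants=4):
    """Number of image/label pairs: fractions x images x resistance levels x flips."""
    return n_fractions * n_images * n_levels * n_variants
```

Calling `dataset_size()` with the defaults reproduces the count in the text: 6 × 20 = 120 base images, 480 after intensity encoding, and 1920 after the original and three flipped copies are counted.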

Convolutional neural network
A convolutional neural network (CNN) is a class of deep neural network that is widely used in image recognition tasks with remarkable success [19]. Several CNN models with different structures have been successfully applied to image recognition, such as AlexNet [20], ResNet [21] and LeNet-5 [22]. The CNN model outperforms other machine learning algorithms in terms of non-linear function approximation and the ability to extract and articulate data features [23]. Moreover, compared to conventional artificial neural networks such as multilayer perceptrons and fully connected feedforward networks, the CNN significantly reduces the computational demands of processing high-dimensional image information owing to parameter sharing and dimensionality reduction. Figure 3 shows the architecture of our CNN model obtained through hyperparameter tuning, which is discussed in the next section. The CNN model consists of an input layer (i.e., the Cu-CNT network), an output layer (i.e., the predicted conductivities) and 6 hidden layers. The input layer is a single-channel Cu-CNT network image, equivalent to a 228 × 228 × 1 matrix. The image size was chosen to retain high resolution and capture minuscule details of the CNT networks, particularly at high CNT fractions. A convolution layer is added to generate feature maps from the input layer.
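To make the parameter-sharing argument concrete, the short sketch below compares the number of trainable parameters in a hypothetical fully connected layer with one output per pixel against a 3 × 3 convolution layer applied to the same 228 × 228 × 1 input. The filter count `N_FILTERS` is an assumed value chosen for illustration only; the text does not state it here.

```python
def dense_parameters(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs


def conv_parameters(kernel_size, in_channels, n_filters):
    """Weights plus biases of a convolution layer; kernels are shared across positions."""
    return kernel_size * kernel_size * in_channels * n_filters + n_filters


PIXELS = 228 * 228   # single-channel input image from the text
N_FILTERS = 16       # hypothetical filter count, not taken from the source

dense = dense_parameters(PIXELS, PIXELS)   # one output unit per input pixel
conv = conv_parameters(3, 1, N_FILTERS)    # 3 x 3 kernels on one input channel

print(f"dense: {dense:,} parameters, convolution: {conv:,} parameters")
```

Because each kernel is reused at every image position, the convolution layer's parameter count does not depend on the image size at all, whereas the fully connected count grows with the square of the number of pixels.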
The convolutional layer contains a series of 3 × 3 kernels that are convolved with the inputs to extract features while preserving the spatial relationships between image pixels. A batch normalization layer is added after every convolution layer to normalize its outputs over each minibatch (to zero mean and unit variance, followed by a learnable scale and shift). A rectified linear unit (ReLU) activation layer is added to mitigate the vanishing gradient problem, allowing the model to learn faster with improved stability. To down-sample the input feature map, a pooling layer with a filter size of 2 and a stride of 2 is inserted after every activation layer. The pooling layer applies an average pooling operation over the prescribed filter size and abstracts the input feature maps, reducing the low-level features while extracting high-order features. After 6 repetitions of these hidden layers, a fully connected layer takes all the outputs of the previous layer as a one-dimensional feature vector and connects them to its single neuron. The feature vector represents the major features of the original input and can be used to establish the regression model for the electrical or thermal conductivities.
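As a rough illustration of the down-sampling path, the sketch below implements 2 × 2 average pooling with a stride of 2 on a nested-list feature map and traces the spatial size of a 228 × 228 input through six pooling stages. It assumes that the convolutions preserve the spatial size (i.e., "same" padding), that a trailing odd row or column is dropped by the pooling, and that the "6 hidden layers" correspond to six convolution-normalization-activation-pooling blocks; none of these details is stated in the text, and the function names are hypothetical.

```python
def avg_pool_2x2(fmap):
    """Average pooling with a 2 x 2 window and stride 2 (trailing odd row/column dropped)."""
    rows, cols = len(fmap) // 2, len(fmap[0]) // 2
    return [
        [
            (fmap[2 * r][2 * c] + fmap[2 * r][2 * c + 1]
             + fmap[2 * r + 1][2 * c] + fmap[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def spatial_sizes(input_size=228, n_blocks=6):
    """Side length after each block, assuming size-preserving convolutions."""
    sizes = [input_size]
    for _ in range(n_blocks):
        sizes.append(sizes[-1] // 2)  # each pooling stage halves the side length
    return sizes


print(spatial_sizes())
```

Under these assumptions the feature maps shrink from 228 pixels per side to only a few pixels per side after the sixth block, which is what keeps the final fully connected layer small.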
To train the CNN model, the stochastic gradient descent (SGD) algorithm is used. SGD is a popular iterative optimization technique for determining the weights that minimize the errors in neural networks. SGD calculates the gradients on small randomized subsets of the training set, called minibatches. The weights are moved along the negative gradient in small steps whose size is determined by the learning rate. A full forward and backward pass over the complete training dataset constitutes 1 epoch. By testing minibatch sizes in the range of 5–20 and learning rates in the range of 10⁻²–10⁻⁷, we selected a minibatch size of 20, a learning rate of 10⁻³ and 400 epochs. The learning rate was dropped by a factor of 0.1 after every 150 epochs, allowing the model to learn an optimal set of weights. The model training begins by initializing the kernel parameters, which extract the features of the Cu-CNT network, using the Gaussian initialization method. The kernel parameters are optimized according to the Euclidean loss function, $\frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i'\right)^2$, which calculates the mean of the squared differences between the predicted value, $y_i$, and the known value, $y_i'$. The loss function is subsequently minimized after each iteration by updating the parameters.
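The learning-rate schedule and loss described above can be written compactly as follows. This is a minimal sketch, assuming zero-based epoch numbering (so the first drop takes effect at epoch 150) and scalar conductivity targets; the function names are hypothetical and the code is not tied to any particular deep learning framework.

```python
def learning_rate(epoch, initial=1e-3, drop_factor=0.1, drop_period=150):
    """Piecewise-constant schedule: multiply by drop_factor after every drop_period epochs."""
    return initial * drop_factor ** (epoch // drop_period)


def euclidean_loss(predicted, known):
    """Mean of squared differences between predicted and known values."""
    n = len(predicted)
    return sum((y - y_ref) ** 2 for y, y_ref in zip(predicted, known)) / n


# Learning rate at the start of each stage of a 400-epoch run.
for epoch in (0, 150, 300):
    print(epoch, learning_rate(epoch))
```

With 400 epochs and a drop period of 150, the schedule passes through three stages (10⁻³, 10⁻⁴ and 10⁻⁵), so the final 100 epochs fine-tune the weights with the smallest step size.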

Results and discussion
The number of neurons in the hidden layers was adjusted to balance the model accuracy and training time. The coefficient of determination (R²) was employed to quantitatively examine the model accuracy. Table 1
Figure 4 compares the CNN model approximations and FEM predictions for σe/σCu and ke/kCu. Overall, the training of the CNN was successful, with R² ≥ 0.99 on the training set, and the trained CNN was able to accurately predict the unseen Cu-CNT network models, with R² ≥ 0.98 on the validation set. Note that training the CNN with 1920 Cu-CNT models took only ~3 min. With this training cost, the CNN model can estimate the conductivity of an unseen Cu-CNT network within 1 s, whereas the FEM requires ~155 min on average for the same task. These characteristics suggest that the deep learning approach is promising when the properties of stochastic composite materials must be estimated rapidly and repeatedly, provided that a training dataset, i.e., images of composite materials and the corresponding properties, is available.
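For reference, the coefficient of determination used as the accuracy metric can be computed as sketched below. The sketch assumes the standard definition, R² = 1 − SS_res/SS_tot, with the FEM values taken as the reference; the text does not spell out the exact formula, and the function name is hypothetical.

```python
def r_squared(reference, predicted):
    """Coefficient of determination of predictions against reference values."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))  # residual sum of squares
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)              # total sum of squares
    return 1.0 - ss_res / ss_tot
```

A perfect surrogate gives R² = 1, while a model that always returns the mean of the reference values gives R² = 0, so values of 0.98 and above indicate that nearly all of the variance in the FEM conductivities is reproduced.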
The training and validation datasets were designed to include diversified examples with various CNT fractions and interfacial resistances. The diversity of the training data critically affects whether the neural network is able to overcome bias. In our dataset, σe/σCu ranges from 0.08 to 10.45 and ke/kCu ranges from 0.15 to 4.25, as shown in Fig. 4. For the data generated with a large interfacial resistance (i.e., Rc = 20 kΩ and R″t,c = 10⁻⁷ m²K/W), the Cu-CNT composites with high f (i.e., f ≥ 50%) possessed effective conductivities that were smaller than that of copper. For the examples with large Rc and R″t,c and small f (i.e., f < 20%), the normalized effective conductivities were close to unity. When the interfacial resistance is small (i.e., Rc = 20 Ω and R″t,c = 10⁻¹⁰ m²K/W), the examples with high f (i.e., f ≥ 50%) exhibited effective conductivities that were greater than that of copper (i.e., 7.5 < σe/σCu < 11 and 2 < ke/kCu < 4.5). By combining various levels of f, Rc and R″t,c, the dataset incorporated examples having effective conductivities similar to those of previously reported Cu-CNT composites [24–31].
The method introduced in this article demonstrates that deep neural networks can rapidly approximate the complex relation between the morphology of fiber composites and their electrical and thermal transport properties. The approach will be useful for researchers who need a surrogate model for fiber composite systems that estimates the composite properties before expensive finite element simulations or actual measurements are performed. Applying the approach to infer the properties of actual composite materials is therefore a natural extension of this work. Since the images of Cu-CNT composites used in this work showed the shapes of the CNTs distinctly, without any blurriness, the CNN readily recognized the layouts of the CNTs and made accurate predictions. To apply the approach to actual materials, it will be necessary to acquire microscopic images from various parts of the samples and to process the images to extract the morphology of the CNT network, similar to Fig. 1a, while eliminating the background image features.

Conclusions
This work reports a CNN that is trained to approximate the effective electrical and thermal conductivities of stochastic Cu-CNT networks when their 2D images are provided as inputs. The CNN architecture and hyperparameters were optimized to make approximations with R² > 0.98. Despite the complex and nonlinear transport mechanism, the CNN made predictions for unseen Cu-CNT networks of various CNT volume fractions and Cu-CNT interfacial resistances with R² greater than 0.98. To provide a variety of learnable examples to the CNN without performing additional FEM simulations, a simple image augmentation technique was used to enlarge the training dataset fourfold. A possible extension of this work is to investigate the potential of CNNs or other deep learning methods as rapid prediction models for microscopic images of fabricated bulk-scale Cu-CNT networks or other composite materials.